Search | BVS Regional Portal

1.

Independent expansion, selection and hypervariability of the TBC1D3 gene family in humans.

Guitart, Xavi; Porubsky, David; Yoo, DongAhn; Dougherty, Max L; Dishuck, Philip C; Munson, Katherine M; Lewis, Alexandra P; Hoekzema, Kendra; Knuth, Jordan; Chang, Stephen; Pastinen, Tomi; Eichler, Evan E.

bioRxiv ; 2024 Mar 13.

Article in English | MEDLINE | ID: mdl-38654825

ABSTRACT

TBC1D3 is a primate-specific gene family that has expanded in the human lineage and has been implicated in neuronal progenitor proliferation and expansion of the frontal cortex. The gene family and its expression have been challenging to investigate because it is embedded in high-identity and highly variable segmental duplications. We sequenced and assembled the gene family using long-read sequencing data from 34 humans and 11 nonhuman primate species. Our analysis shows that this particular gene family has independently duplicated in at least five primate lineages, and the duplicated loci are enriched at sites of large-scale chromosomal rearrangements on chromosome 17. We find that most humans vary along two TBC1D3 clusters where human haplotypes are highly variable in copy number, differing by as many as 20 copies, and structure (structural heterozygosity 90%). We also show evidence of positive selection, as well as a significant change in the predicted human TBC1D3 protein sequence. Lastly, we find that, despite multiple duplications, human TBC1D3 expression is limited to a subset of copies and, most notably, from a single paralog group: TBC1D3-CDKL. These observations may help explain why a gene potentially important in cortical development can be so variable in the human population.

2.

Structurally divergent and recurrently mutated regions of primate genomes.

Mao, Yafei; Harvey, William T; Porubsky, David; Munson, Katherine M; Hoekzema, Kendra; Lewis, Alexandra P; Audano, Peter A; Rozanski, Allison; Yang, Xiangyu; Zhang, Shilong; Yoo, DongAhn; Gordon, David S; Fair, Tyler; Wei, Xiaoxi; Logsdon, Glennis A; Haukness, Marina; Dishuck, Philip C; Jeong, Hyeonsoo; Del Rosario, Ricardo; Bauer, Vanessa L; Fattor, Will T; Wilkerson, Gregory K; Mao, Yuxiang; Shi, Yongyong; Sun, Qiang; Lu, Qing; Paten, Benedict; Bakken, Trygve E; Pollen, Alex A; Feng, Guoping; Sawyer, Sara L; Warren, Wesley C; Carbone, Lucia; Eichler, Evan E.

Cell ; 187(6): 1547-1562.e13, 2024 Mar 14.

Article in English | MEDLINE | ID: mdl-38428424

ABSTRACT

We sequenced and assembled using multiple long-read sequencing technologies the genomes of chimpanzee, bonobo, gorilla, orangutan, gibbon, macaque, owl monkey, and marmoset. We identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific fixed differences. We estimate that 819.47 Mbp or ~27% of the genome has been affected by SVs across primate evolution. We identify 1,607 structurally divergent regions wherein recurrent structural variation contributes to creating SV hotspots where genes are recurrently lost (e.g., CARD, C4, and OLAH gene families) and additional lineage-specific genes are generated (e.g., CKAP2, VPS36, ACBD7, and NEK5 paralogs), becoming targets of rapid chromosomal diversification and positive selection (e.g., RGPD gene family). High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species.

Subjects

Genome , Primates , Animals , Humans , Base Sequence , Primates/classification , Primates/genetics , Biological Evolution , DNA Sequence Analysis , Genomic Structural Variation

3.

The Complete Sequence and Comparative Analysis of Ape Sex Chromosomes.

Makova, Kateryna D; Pickett, Brandon D; Harris, Robert S; Hartley, Gabrielle A; Cechova, Monika; Pal, Karol; Nurk, Sergey; Yoo, DongAhn; Li, Qiuhui; Hebbar, Prajna; McGrath, Barbara C; Antonacci, Francesca; Aubel, Margaux; Biddanda, Arjun; Borchers, Matthew; Bomberg, Erich; Bouffard, Gerard G; Brooks, Shelise Y; Carbone, Lucia; Carrel, Laura; Carroll, Andrew; Chang, Pi-Chuan; Chin, Chen-Shan; Cook, Daniel E; Craig, Sarah J C; de Gennaro, Luciana; Diekhans, Mark; Dutra, Amalia; Garcia, Gage H; Grady, Patrick G S; Green, Richard E; Haddad, Diana; Hallast, Pille; Harvey, William T; Hickey, Glenn; Hillis, David A; Hoyt, Savannah J; Jeong, Hyeonsoo; Kamali, Kaivan; Kosakovsky Pond, Sergei L; LaPolice, Troy M; Lee, Charles; Lewis, Alexandra P; Loh, Yong-Hwee E; Masterson, Patrick; McCoy, Rajiv C; Medvedev, Paul; Miga, Karen H; Munson, Katherine M; Pak, Evgenia.

bioRxiv ; 2023 Dec 01.

Article in English | MEDLINE | ID: mdl-38077089

ABSTRACT

Apes possess two sex chromosomes: the male-specific Y and the X shared by males and females. The Y chromosome is crucial for male reproduction, with deletions linked to infertility. The X chromosome carries genes vital for reproduction and cognition. Variation in mating patterns and brain function among great apes suggests corresponding differences in their sex chromosome structure and evolution. However, due to their highly repetitive nature and incomplete reference assemblies, ape sex chromosomes have been challenging to study. Here, using the state-of-the-art experimental and computational methods developed for the telomere-to-telomere (T2T) human genome, we produced gapless, complete assemblies of the X and Y chromosomes for five great apes (chimpanzee, bonobo, gorilla, Bornean and Sumatran orangutans) and a lesser ape, the siamang gibbon. These assemblies completely resolved ampliconic, palindromic, and satellite sequences, including the entire centromeres, allowing us to untangle the intricacies of ape sex chromosome evolution. We found that, compared to the X, ape Y chromosomes vary greatly in size and have low alignability and high levels of structural rearrangements. This divergence on the Y arises from the accumulation of lineage-specific ampliconic regions and palindromes (which are shared more broadly among species on the X) and from the abundance of transposable elements and satellites (which have a lower representation on the X). Our analysis of Y chromosome genes revealed lineage-specific expansions of multi-copy gene families and signatures of purifying selection. In summary, the Y exhibits dynamic evolution, while the X is more stable. Finally, mapping short-read sequencing data from >100 great ape individuals revealed the patterns of diversity and selection on their sex chromosomes, demonstrating the utility of these reference assemblies for studies of great ape evolution. These complete sex chromosome assemblies are expected to further inform conservation genetics of nonhuman apes, all of which are endangered species.

4.

Assembly of 43 human Y chromosomes reveals extensive complexity and variation.

Hallast, Pille; Ebert, Peter; Loftus, Mark; Yilmaz, Feyza; Audano, Peter A; Logsdon, Glennis A; Bonder, Marc Jan; Zhou, Weichen; Höps, Wolfram; Kim, Kwondo; Li, Chong; Hoyt, Savannah J; Dishuck, Philip C; Porubsky, David; Tsetsos, Fotios; Kwon, Jee Young; Zhu, Qihui; Munson, Katherine M; Hasenfeld, Patrick; Harvey, William T; Lewis, Alexandra P; Kordosky, Jennifer; Hoekzema, Kendra; O'Neill, Rachel J; Korbel, Jan O; Tyler-Smith, Chris; Eichler, Evan E; Shi, Xinghua; Beck, Christine R; Marschall, Tobias; Konkel, Miriam K; Lee, Charles.

Nature ; 621(7978): 355-364, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37612510

ABSTRACT

The prevalence of highly repetitive sequences within the human Y chromosome has prevented its complete assembly to date1 and led to its systematic omission from genomic analyses. Here we present de novo assemblies of 43 Y chromosomes spanning 182,900 years of human evolution and report considerable diversity in size and structure. Half of the male-specific euchromatic region is subject to large inversions with a greater than twofold higher recurrence rate compared with all other chromosomes2. Ampliconic sequences associated with these inversions show differing mutation rates that are sequence context dependent, and some ampliconic genes exhibit evidence for concerted evolution with the acquisition and purging of lineage-specific pseudogenes. The largest heterochromatic region in the human genome, Yq12, is composed of alternating repeat arrays that show extensive variation in the number, size and distribution, but retain a 1:1 copy-number ratio. Finally, our data suggest that the boundary between the recombining pseudoautosomal region 1 and the non-recombining portions of the X and Y chromosomes lies 500 kb away from the currently established1 boundary. The availability of fully sequence-resolved Y chromosomes from multiple individuals provides a unique opportunity for identifying new associations of traits with specific Y-chromosomal variants and garnering insights into the evolution and function of complex regions of the human genome.

Subjects

Human Y Chromosomes , Molecular Evolution , Humans , Male , Human Y Chromosomes/genetics , Human Genome/genetics , Genomics , Mutation Rate , Phenotype , Euchromatin/genetics , Pseudogenes , Genetic Variation/genetics , Human X Chromosomes/genetics , Pseudoautosomal Regions/genetics

5.

The complete sequence of a human Y chromosome.

Rhie, Arang; Nurk, Sergey; Cechova, Monika; Hoyt, Savannah J; Taylor, Dylan J; Altemose, Nicolas; Hook, Paul W; Koren, Sergey; Rautiainen, Mikko; Alexandrov, Ivan A; Allen, Jamie; Asri, Mobin; Bzikadze, Andrey V; Chen, Nae-Chyun; Chin, Chen-Shan; Diekhans, Mark; Flicek, Paul; Formenti, Giulio; Fungtammasan, Arkarachai; Garcia Giron, Carlos; Garrison, Erik; Gershman, Ariel; Gerton, Jennifer L; Grady, Patrick G S; Guarracino, Andrea; Haggerty, Leanne; Halabian, Reza; Hansen, Nancy F; Harris, Robert; Hartley, Gabrielle A; Harvey, William T; Haukness, Marina; Heinz, Jakob; Hourlier, Thibaut; Hubley, Robert M; Hunt, Sarah E; Hwang, Stephen; Jain, Miten; Kesharwani, Rupesh K; Lewis, Alexandra P; Li, Heng; Logsdon, Glennis A; Lucas, Julian K; Makalowski, Wojciech; Markovic, Christopher; Martin, Fergal J; Mc Cartney, Ann M; McCoy, Rajiv C; McDaniel, Jennifer; McNulty, Brandy M.

Nature ; 621(7978): 344-354, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37612512

ABSTRACT

The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications1-3. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished4,5. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome4 and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.

Subjects

Human Y Chromosomes , Genomics , DNA Sequence Analysis , Humans , Base Sequence , Human Y Chromosomes/genetics , Satellite DNA/genetics , Genetic Variation/genetics , Population Genetics , Genomics/methods , Genomics/standards , Heterochromatin/genetics , Multigene Family/genetics , Reference Standards , Genomic Segmental Duplications/genetics , DNA Sequence Analysis/standards , Tandem Repeat Sequences/genetics , Telomere/genetics

6.

Increased mutation and gene conversion within human segmental duplications.

Vollger, Mitchell R; Dishuck, Philip C; Harvey, William T; DeWitt, William S; Guitart, Xavi; Goldberg, Michael E; Rozanski, Allison N; Lucas, Julian; Asri, Mobin; Munson, Katherine M; Lewis, Alexandra P; Hoekzema, Kendra; Logsdon, Glennis A; Porubsky, David; Paten, Benedict; Harris, Kelley; Hsieh, PingHsun; Eichler, Evan E.

Nature ; 617(7960): 325-334, 2023 05.

Article in English | MEDLINE | ID: mdl-37165237

ABSTRACT

Single-nucleotide variants (SNVs) in segmental duplications (SDs) have not been systematically assessed because of the limitations of mapping short-read sequencing data1,2. Here we constructed 1:1 unambiguous alignments spanning high-identity SDs across 102 human haplotypes and compared the pattern of SNVs between unique and duplicated regions3,4. We find that human SNVs are elevated 60% in SDs compared to unique regions and estimate that at least 23% of this increase is due to interlocus gene conversion (IGC) with up to 4.3 megabase pairs of SD sequence converted on average per human haplotype. We develop a genome-wide map of IGC donors and acceptors, including 498 acceptor and 454 donor hotspots affecting the exons of about 800 protein-coding genes. These include 171 genes that have 'relocated' on average 1.61 megabase pairs in a subset of human haplotypes. Using a coalescent framework, we show that SD regions are slightly evolutionarily older when compared to unique sequences, probably owing to IGC. SNVs in SDs, however, show a distinct mutational spectrum: a 27.1% increase in transversions that convert cytosine to guanine or the reverse across all triplet contexts and a 7.6% reduction in the frequency of CpG-associated mutations when compared to unique DNA. We reason that these distinct mutational properties help to maintain an overall higher GC content of SD DNA compared to that of unique DNA, probably driven by GC-biased conversion between paralogous sequences5,6.

Subjects

Gene Conversion , Mutation , Genomic Segmental Duplications , Humans , Gene Conversion/genetics , Human Genome/genetics , Single Nucleotide Polymorphism/genetics , Haplotypes/genetics , Exons/genetics , Cytosine/chemistry , Guanine/chemistry , CpG Islands/genetics

7.

Structurally divergent and recurrently mutated regions of primate genomes.

Mao, Yafei; Harvey, William T; Porubsky, David; Munson, Katherine M; Hoekzema, Kendra; Lewis, Alexandra P; Audano, Peter A; Rozanski, Allison; Yang, Xiangyu; Zhang, Shilong; Gordon, David S; Wei, Xiaoxi; Logsdon, Glennis A; Haukness, Marina; Dishuck, Philip C; Jeong, Hyeonsoo; Del Rosario, Ricardo; Bauer, Vanessa L; Fattor, Will T; Wilkerson, Gregory K; Lu, Qing; Paten, Benedict; Feng, Guoping; Sawyer, Sara L; Warren, Wesley C; Carbone, Lucia; Eichler, Evan E.

bioRxiv ; 2023 Mar 07.

Article in English | MEDLINE | ID: mdl-36945442

ABSTRACT

To better understand the pattern of primate genome structural variation, we sequenced and assembled using multiple long-read sequencing technologies the genomes of eight nonhuman primate species, including New World monkeys (owl monkey and marmoset), Old World monkey (macaque), Asian apes (orangutan and gibbon), and African ape lineages (gorilla, bonobo, and chimpanzee). Compared to the human genome, we identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific fixed differences. Across 50 million years of primate evolution, we estimate that 819.47 Mbp or ~27% of the genome has been affected by SVs based on analysis of these primate lineages. We identify 1,607 structurally divergent regions (SDRs) wherein recurrent structural variation contributes to creating SV hotspots where genes are recurrently lost (CARDs, ABCD7, OLAH) and new lineage-specific genes are generated (e.g., CKAP2, NEK5) and have become targets of rapid chromosomal diversification and positive selection (e.g., RGPDs). High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species for the first time.

8.

Segmental duplications and their variation in a complete human genome.

Vollger, Mitchell R; Guitart, Xavi; Dishuck, Philip C; Mercuri, Ludovica; Harvey, William T; Gershman, Ariel; Diekhans, Mark; Sulovari, Arvis; Munson, Katherine M; Lewis, Alexandra P; Hoekzema, Kendra; Porubsky, David; Li, Ruiyang; Nurk, Sergey; Koren, Sergey; Miga, Karen H; Phillippy, Adam M; Timp, Winston; Ventura, Mario; Eichler, Evan E.

Science ; 376(6588): eabj6965, 2022 04.

Article in English | MEDLINE | ID: mdl-35357917

ABSTRACT

Despite their importance in disease and evolution, highly identical segmental duplications (SDs) are among the last regions of the human reference genome (GRCh38) to be fully sequenced. Using a complete telomere-to-telomere human genome (T2T-CHM13), we present a comprehensive view of human SD organization. SDs account for nearly one-third of the additional sequence, increasing the genome-wide estimate from 5.4 to 7.0% [218 million base pairs (Mbp)]. An analysis of 268 human genomes shows that 91% of the previously unresolved T2T-CHM13 SD sequence (68.3 Mbp) better represents human copy number variation. Comparing long-read assemblies from human (n = 12) and nonhuman primate (n = 5) genomes, we systematically reconstruct the evolution and structural haplotype diversity of biomedically relevant and duplicated genes. This analysis reveals patterns of structural heterozygosity and evolutionary differences in SD organization between humans and other primates.

Subjects

DNA Copy Number Variations , Gene Duplication , Human Genome , Genomic Segmental Duplications , Molecular Evolution , GTPase-Activating Proteins/genetics , Humans , Single Nucleotide Polymorphism , Proto-Oncogene Proteins/genetics

9.

Familial long-read sequencing increases yield of de novo mutations.

Noyes, Michelle D; Harvey, William T; Porubsky, David; Sulovari, Arvis; Li, Ruiyang; Rose, Nicholas R; Audano, Peter A; Munson, Katherine M; Lewis, Alexandra P; Hoekzema, Kendra; Mantere, Tuomo; Graves-Lindsay, Tina A; Sanders, Ashley D; Goodwin, Sara; Kramer, Melissa; Mokrab, Younes; Zody, Michael C; Hoischen, Alexander; Korbel, Jan O; McCombie, W Richard; Eichler, Evan E.

Am J Hum Genet ; 109(4): 631-646, 2022 04 07.

Article in English | MEDLINE | ID: mdl-35290762

ABSTRACT

Studies of de novo mutation (DNM) have typically excluded some of the most repetitive and complex regions of the genome because these regions cannot be unambiguously mapped with short-read sequencing data. To better understand the genome-wide pattern of DNM, we generated long-read sequence data from an autism parent-child quad with an affected female where no pathogenic variant had been discovered in short-read Illumina sequence data. We deeply sequenced all four individuals by using three sequencing platforms (Illumina, Oxford Nanopore, and Pacific Biosciences) and three complementary technologies (Strand-seq, optical mapping, and 10X Genomics). Using long-read sequencing, we initially discovered and validated 171 DNMs across two children, a 20% increase in the number of de novo single-nucleotide variants (SNVs) and indels when compared to short-read callsets. The number of DNMs further increased by 5% when considering a more complete human reference (T2T-CHM13) because of the recovery of events in regions absent from GRCh38 (e.g., three DNMs in heterochromatic satellites). In total, we validated 195 de novo germline mutations and 23 potential post-zygotic mosaic mutations across both children; the overall true substitution rate based on this integrated callset is at least 1.41 × 10⁻⁸ substitutions per nucleotide per generation. We also identified six de novo insertions and deletions in tandem repeats, two of which represent structural variants. We demonstrate that long-read sequencing and assembly, especially when combined with a more complete reference genome, increases the number of DNMs by >25% compared to previous studies, providing a more complete catalog of DNM compared to short-read data alone.

Subjects

Genomics , High-Throughput Nucleotide Sequencing , Female , Humans , Mutation/genetics , Nucleotides , DNA Sequence Analysis , Software

10.

Evidence for opposing selective forces operating on human-specific duplicated TCAF genes in Neanderthals and humans.

Hsieh, PingHsun; Dang, Vy; Vollger, Mitchell R; Mao, Yafei; Huang, Tzu-Hsueh; Dishuck, Philip C; Baker, Carl; Cantsilieris, Stuart; Lewis, Alexandra P; Munson, Katherine M; Sorensen, Melanie; Welch, AnneMarie E; Underwood, Jason G; Eichler, Evan E.

Nat Commun ; 12(1): 5118, 2021 08 25.

Article in English | MEDLINE | ID: mdl-34433829

ABSTRACT

TRP channel-associated factor 1/2 (TCAF1/TCAF2) proteins antagonistically regulate the cold-sensor protein TRPM8 in multiple human tissues. Understanding their significance has been complicated given the locus spans a gap-ridden region with complex segmental duplications in GRCh38. Using long-read sequencing, we sequence-resolve the locus, annotate full-length TCAF models in primate genomes, and show substantial human-specific TCAF copy number variation. We identify two human super haplogroups, H4 and H5, and establish that TCAF duplications originated ~1.7 million years ago but diversified only in Homo sapiens by recurrent structural mutations. Conversely, in all archaic-hominin samples the fixation for a specific H4 haplotype without duplication is likely due to positive selection. Here, our results of TCAF copy number expansion, selection signals in hominins, and differential TCAF2 expression between haplogroups and high TCAF2 and TRPM8 expression in liver and prostate in modern-day humans imply TCAF diversification among hominins potentially in response to cold or dietary adaptations.

Subjects

Gene Duplication , Hominidae/genetics , Membrane Proteins/genetics , Genetic Selection , Animals , DNA Copy Number Variations , Molecular Evolution , Human Genome , Haplotypes , Humans , Neanderthals , Phylogeny

11.

Targeted long-read sequencing identifies missing disease-causing variation.

Miller, Danny E; Sulovari, Arvis; Wang, Tianyun; Loucks, Hailey; Hoekzema, Kendra; Munson, Katherine M; Lewis, Alexandra P; Fuerte, Edith P Almanza; Paschal, Catherine R; Walsh, Tom; Thies, Jenny; Bennett, James T; Glass, Ian; Dipple, Katrina M; Patterson, Karynne; Bonkowski, Emily S; Nelson, Zoe; Squire, Audrey; Sikes, Megan; Beckman, Erika; Bennett, Robin L; Earl, Dawn; Lee, Winston; Allikmets, Rando; Perlman, Seth J; Chow, Penny; Hing, Anne V; Wenger, Tara L; Adam, Margaret P; Sun, Angela; Lam, Christina; Chang, Irene; Zou, Xue; Austin, Stephanie L; Huggins, Erin; Safi, Alexias; Iyengar, Apoorva K; Reddy, Timothy E; Majoros, William H; Allen, Andrew S; Crawford, Gregory E; Kishnani, Priya S; King, Mary-Claire; Cherry, Tim; Chong, Jessica X; Bamshad, Michael J; Nickerson, Deborah A; Mefford, Heather C; Doherty, Dan; Eichler, Evan E.

Am J Hum Genet ; 108(8): 1436-1449, 2021 08 05.

Article in English | MEDLINE | ID: mdl-34216551

ABSTRACT

Despite widespread clinical genetic testing, many individuals with suspected genetic conditions lack a precise diagnosis, limiting their opportunity to take advantage of state-of-the-art treatments. In some cases, testing reveals difficult-to-evaluate structural differences, candidate variants that do not fully explain the phenotype, single pathogenic variants in recessive disorders, or no variants in genes of interest. Thus, there is a need for better tools to identify a precise genetic diagnosis in individuals when conventional testing approaches have been exhausted. We performed targeted long-read sequencing (T-LRS) using adaptive sampling on the Oxford Nanopore platform on 40 individuals, 10 of whom lacked a complete molecular diagnosis. We computationally targeted up to 151 Mbp of sequence per individual and searched for pathogenic substitutions, structural variants, and methylation differences using a single data source. We detected all genomic aberrations (including single-nucleotide variants, copy number changes, repeat expansions, and methylation differences) identified by prior clinical testing. In 8/8 individuals with complex structural rearrangements, T-LRS enabled more precise resolution of the mutation, leading to changes in clinical management in one case. In ten individuals with suspected Mendelian conditions lacking a precise genetic diagnosis, T-LRS identified pathogenic or likely pathogenic variants in six and variants of uncertain significance in two others. T-LRS accurately identifies pathogenic structural variants, resolves complex rearrangements, and identifies Mendelian variants not detected by other technologies. T-LRS represents an efficient and cost-effective strategy to evaluate high-priority genes and regions or complex clinical testing results.

Subjects

Chromosome Aberrations , Cytogenetic Analysis/methods , Inborn Genetic Diseases/diagnosis , Inborn Genetic Diseases/genetics , Genetic Predisposition to Disease , Human Genome , Mutation , DNA Copy Number Variations , Female , Genetic Testing , High-Throughput Nucleotide Sequencing , Humans , Karyotyping , Male , DNA Sequence Analysis

12.

A high-quality bonobo genome refines the analysis of hominid evolution.

Mao, Yafei; Catacchio, Claudia R; Hillier, LaDeana W; Porubsky, David; Li, Ruiyang; Sulovari, Arvis; Fernandes, Jason D; Montinaro, Francesco; Gordon, David S; Storer, Jessica M; Haukness, Marina; Fiddes, Ian T; Murali, Shwetha Canchi; Dishuck, Philip C; Hsieh, PingHsun; Harvey, William T; Audano, Peter A; Mercuri, Ludovica; Piccolo, Ilaria; Antonacci, Francesca; Munson, Katherine M; Lewis, Alexandra P; Baker, Carl; Underwood, Jason G; Hoekzema, Kendra; Huang, Tzu-Hsueh; Sorensen, Melanie; Walker, Jerilyn A; Hoffman, Jinna; Thibaud-Nissen, Françoise; Salama, Sofie R; Pang, Andy W C; Lee, Joyce; Hastie, Alex R; Paten, Benedict; Batzer, Mark A; Diekhans, Mark; Ventura, Mario; Eichler, Evan E.

Nature ; 594(7861): 77-81, 2021 06.

Article in English | MEDLINE | ID: mdl-33953399

ABSTRACT

The divergence of chimpanzee and bonobo provides one of the few examples of recent hominid speciation1,2. Here we describe a fully annotated, high-quality bonobo genome assembly, which was constructed without guidance from reference genomes by applying a multiplatform genomics approach. We generate a bonobo genome assembly in which more than 98% of genes are completely annotated and 99% of the gaps are closed, including the resolution of about half of the segmental duplications and almost all of the full-length mobile elements. We compare the bonobo genome to those of other great apes1,3-5 and identify more than 5,569 fixed structural variants that specifically distinguish the bonobo and chimpanzee lineages. We focus on genes that have been lost, changed in structure or expanded in the last few million years of bonobo evolution. We produce a high-resolution map of incomplete lineage sorting and estimate that around 5.1% of the human genome is genetically closer to chimpanzee or bonobo and that more than 36.5% of the genome shows incomplete lineage sorting if we consider a deeper phylogeny including gorilla and orangutan. We also show that 26% of the segments of incomplete lineage sorting between human and chimpanzee or human and bonobo are non-randomly distributed and that genes within these clustered segments show significant excess of amino acid replacement compared to the rest of the genome.

Subjects

Molecular Evolution , Genome/genetics , Genomics , Pan paniscus/genetics , Phylogeny , Animals , Eukaryotic Initiation Factor-4A/genetics , Female , Genes , Gorilla gorilla/genetics , Molecular Sequence Annotation/standards , Pan troglodytes/genetics , Pongo/genetics , Genomic Segmental Duplications , DNA Sequence Analysis

13.

Haplotype-resolved diverse human genomes and integrated analysis of structural variation.

Ebert, Peter; Audano, Peter A; Zhu, Qihui; Rodriguez-Martin, Bernardo; Porubsky, David; Bonder, Marc Jan; Sulovari, Arvis; Ebler, Jana; Zhou, Weichen; Serra Mari, Rebecca; Yilmaz, Feyza; Zhao, Xuefang; Hsieh, PingHsun; Lee, Joyce; Kumar, Sushant; Lin, Jiadong; Rausch, Tobias; Chen, Yu; Ren, Jingwen; Santamarina, Martin; Höps, Wolfram; Ashraf, Hufsah; Chuang, Nelson T; Yang, Xiaofei; Munson, Katherine M; Lewis, Alexandra P; Fairley, Susan; Tallon, Luke J; Clarke, Wayne E; Basile, Anna O; Byrska-Bishop, Marta; Corvelo, André; Evani, Uday S; Lu, Tsung-Yu; Chaisson, Mark J P; Chen, Junjie; Li, Chong; Brand, Harrison; Wenger, Aaron M; Ghareghani, Maryam; Harvey, William T; Raeder, Benjamin; Hasenfeld, Patrick; Regier, Allison A; Abel, Haley J; Hall, Ira M; Flicek, Paul; Stegle, Oliver; Gerstein, Mark B; Tubio, Jose M C.

Science ; 372(6537), 2021 04 02.

Article in English | MEDLINE | ID: mdl-33632895

ABSTRACT

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not discovered with short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterized 130 of the most active mobile element source elements and found that 63% of all SVs arise through homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.
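The contiguity figure quoted in this abstract ("minimum contig length needed to cover 50% of the genome") is the N50-style statistic. As a minimal sketch, not taken from the paper, the Python function below shows one common way to compute it. The function name `n50`, the optional `genome_size` parameter, and the toy contig lengths are illustrative assumptions. When `genome_size` is omitted, the total assembly length is used as the denominator (N50); when it is given, the result is relative to that assumed genome size (often called NG50).

```python
def n50(contig_lengths, genome_size=None):
    """Return the length L such that contigs of length >= L cover
    at least half of the target size.

    contig_lengths: iterable of positive contig lengths (hypothetical input).
    genome_size: optional assumed genome size; if None, the sum of the
                 contig lengths is used instead (classic N50).
    """
    lengths = sorted(contig_lengths, reverse=True)
    target = (genome_size if genome_size is not None else sum(lengths)) / 2
    covered = 0
    for length in lengths:
        covered += length
        if covered >= target:
            return length
    # Reached when the list is empty or the contigs never cover half the target.
    raise ValueError("contigs do not cover half of the target size")


# Toy example with made-up contig lengths (not data from the study).
toy = [10, 8, 6, 4, 2]
print(n50(toy))                  # 8: 10 + 8 = 18 covers half of 30
print(n50(toy, genome_size=40))  # 6: 10 + 8 + 6 = 24 covers half of 40
```

Sorting from longest to shortest and accumulating lengths keeps the calculation linear after the sort; the abstract reports this value averaged over haplotype assemblies.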

Subjects

Genetic Variation , Human Genome , Haplotypes , Female , Genotype , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Interspersed Repetitive Sequences , Male , Population Groups/genetics , Quantitative Trait Loci , Retroelements , DNA Sequence Analysis , Sequence Inversion , Whole Genome Sequencing

14.

Sequence diversity analyses of an improved rhesus macaque genome enhance its biomedical utility.

Warren, Wesley C; Harris, R Alan; Haukness, Marina; Fiddes, Ian T; Murali, Shwetha C; Fernandes, Jason; Dishuck, Philip C; Storer, Jessica M; Raveendran, Muthuswamy; Hillier, LaDeana W; Porubsky, David; Mao, Yafei; Gordon, David; Vollger, Mitchell R; Lewis, Alexandra P; Munson, Katherine M; DeVogelaere, Elizabeth; Armstrong, Joel; Diekhans, Mark; Walker, Jerilyn A; Tomlinson, Chad; Graves-Lindsay, Tina A; Kremitzki, Milinn; Salama, Sofie R; Audano, Peter A; Escalona, Merly; Maurer, Nicholas W; Antonacci, Francesca; Mercuri, Ludovica; Maggiolini, Flavia A M; Catacchio, Claudia Rita; Underwood, Jason G; O'Connor, David H; Sanders, Ashley D; Korbel, Jan O; Ferguson, Betsy; Kubisch, H Michael; Picker, Louis; Kalin, Ned H; Rosene, Douglas; Levine, Jon; Abbott, David H; Gray, Stanton B; Sanchez, Mar M; Kovacs-Balint, Zsofia A; Kemnitz, Joseph W; Thomasy, Sara M; Roberts, Jeffrey A; Kinnally, Erin L; Capitanio, John P.

Science ; 370(6523), 2020 12 18.

Article in English | MEDLINE | ID: mdl-33335035

ABSTRACT

The rhesus macaque (Macaca mulatta) is the most widely studied nonhuman primate (NHP) in biomedical research. We present an updated reference genome assembly (Mmul_10, contig N50 = 46 Mbp) that increases the sequence contiguity 120-fold and annotate it using 6.5 million full-length transcripts, thus improving our understanding of gene content, isoform diversity, and repeat organization. With the improved assembly of segmental duplications, we discovered new lineage-specific genes and expanded gene families that are potentially informative in studies of evolution and disease susceptibility. Whole-genome sequencing (WGS) data from 853 rhesus macaques identified 85.7 million single-nucleotide variants (SNVs) and 10.5 million indel variants, including potentially damaging variants in genes associated with human autism and developmental delay, providing a framework for developing noninvasive NHP models of human disease.

Subjects

Genetic Predisposition to Disease , Genome , Macaca mulatta/genetics , Single Nucleotide Polymorphism , Animals , Genetic Variation , Humans , Molecular Sequence Annotation , Whole Genome Sequencing

15.

Adaptive archaic introgression of copy number variants and the discovery of previously unknown human genes.

Hsieh, PingHsun; Vollger, Mitchell R; Dang, Vy; Porubsky, David; Baker, Carl; Cantsilieris, Stuart; Hoekzema, Kendra; Lewis, Alexandra P; Munson, Katherine M; Sorensen, Melanie; Kronenberg, Zev N; Murali, Shwetha; Nelson, Bradley J; Chiatante, Giorgia; Maggiolini, Flavia Angela Maria; Blanché, Hélène; Underwood, Jason G; Antonacci, Francesca; Deleuze, Jean-François; Eichler, Evan E.

Science ; 366(6463), 2019 Oct 18.

Article in English | MEDLINE | ID: mdl-31624180

ABSTRACT

Copy number variants (CNVs) are subject to stronger selective pressure than single-nucleotide variants, but their roles in archaic introgression and adaptation have not been systematically investigated. We show that stratified CNVs are significantly associated with signatures of positive selection in Melanesians and provide evidence for adaptive introgression of large CNVs at chromosomes 16p11.2 and 8p21.3 from Denisovans and Neanderthals, respectively. Using long-read sequence data, we reconstruct the structure and complex evolutionary history of these polymorphisms and show that both encode positively selected genes absent from most human populations. Our results collectively suggest that large CNVs originating in archaic hominins and introgressed into modern humans have played an important role in local population adaptation and represent an insufficiently studied source of large-scale genetic variation.

Subjects

Genetic Introgression , Animals , Chromosome Duplication , Human Chromosomes Pair 16/genetics , Human Chromosomes Pair 8/genetics , DNA Copy Number Variations , Molecular Evolution , Human Genome , Haplotypes , Hominidae/genetics , Humans , Melanesia , Genetic Models , Neanderthals/genetics , Genetic Polymorphism , Genetic Selection , Whole Genome Sequencing

16.

Recurrent somatic loss of TNFRSF14 in classical Hodgkin lymphoma.

Salipante, Stephen J; Adey, Andrew; Thomas, Anju; Lee, Choli; Liu, Yajuan J; Kumar, Akash; Lewis, Alexandra P; Wu, David; Fromm, Jonathan R; Shendure, Jay.

Genes Chromosomes Cancer ; 55(3): 278-87, 2016 Mar.

Article in English | MEDLINE | ID: mdl-26650888

ABSTRACT

Investigation of the genetic lesions underlying classical Hodgkin lymphoma (CHL) has been challenging due to the rarity of Hodgkin and Reed-Sternberg (HRS) cells, the pathognomonic neoplastic cells of CHL. In an effort to catalog more comprehensively recurrent copy number alterations occurring during oncogenesis, we investigated somatic alterations involved in CHL using whole-genome sequencing-mediated copy number analysis of purified HRS cells. We performed low-coverage sequencing of small numbers of intact HRS cells and paired non-neoplastic B lymphocytes isolated by flow cytometric cell sorting from 19 primary cases, as well as two commonly used HRS-derived cell lines (KM-H2 and L1236). We found that HRS cells contain strikingly fewer copy number abnormalities than CHL cell lines. A subset of cases displayed nonintegral chromosomal copy number states, suggesting internal heterogeneity within the HRS cell population. Recurrent somatic copy number alterations involving known factors in CHL pathogenesis were identified (REL, the PD-1 pathway, and TNFAIP3). In eight cases (42%) we observed recurrent copy number loss of chr1:2,352,236-4,574,271, a region containing the candidate tumor suppressor TNFRSF14. Using flow cytometry, we demonstrated reduced TNFRSF14 expression in HRS cells from 5 of 22 additional cases (23%) and in two of three CHL cell lines. These studies suggest that TNFRSF14 dysregulation may contribute to the pathobiology of CHL in a subset of cases.

Subjects

Hodgkin Disease/genetics , Tumor Necrosis Factor Receptor Superfamily Member 14/genetics , Tumor Cell Line , Cell Separation , Flow Cytometry , Hodgkin Disease/metabolism , Humans , Oligonucleotide Array Sequence Analysis , Tumor Necrosis Factor Receptor Superfamily Member 14/biosynthesis , Tumor Necrosis Factor Receptor Superfamily Member 14/deficiency , Reed-Sternberg Cells

17.

Whole genome prediction for preimplantation genetic diagnosis.

Kumar, Akash; Ryan, Allison; Kitzman, Jacob O; Wemmer, Nina; Snyder, Matthew W; Sigurjonsson, Styrmir; Lee, Choli; Banjevic, Milena; Zarutskie, Paul W; Lewis, Alexandra P; Shendure, Jay; Rabinowitz, Matthew.

Genome Med ; 7(1): 35, 2015.

Article in English | MEDLINE | ID: mdl-26019723

ABSTRACT

BACKGROUND: Preimplantation genetic diagnosis (PGD) enables profiling of embryos for genetic disorders prior to implantation. The majority of PGD testing is restricted in the scope of variants assayed or by the availability of extended family members. While recent advances in single cell sequencing show promise, they remain limited by bias in DNA amplification and the rapid turnaround time (<36 h) required for fresh embryo transfer. Here, we describe and validate a method for inferring the inherited whole genome sequence of an embryo for preimplantation genetic diagnosis (PGD). METHODS: We combine haplotype-resolved, parental genome sequencing with rapid embryo genotyping to predict the whole genome sequence of a day-5 human embryo in a couple at risk of transmitting alpha-thalassemia. RESULTS: Inheritance was predicted at approximately 3 million paternally and/or maternally heterozygous sites with greater than 99% accuracy. Furthermore, we successfully phase and predict the transmission of an HBA1/HBA2 deletion from each parent. CONCLUSIONS: Our results suggest that preimplantation whole genome prediction may facilitate the comprehensive diagnosis of diseases with a known genetic basis in embryos.

18.

Mutations in SPAG1 cause primary ciliary dyskinesia associated with defective outer and inner dynein arms.

Knowles, Michael R; Ostrowski, Lawrence E; Loges, Niki T; Hurd, Toby; Leigh, Margaret W; Huang, Lu; Wolf, Whitney E; Carson, Johnny L; Hazucha, Milan J; Yin, Weining; Davis, Stephanie D; Dell, Sharon D; Ferkol, Thomas W; Sagel, Scott D; Olivier, Kenneth N; Jahnke, Charlotte; Olbrich, Heike; Werner, Claudius; Raidt, Johanna; Wallmeier, Julia; Pennekamp, Petra; Dougherty, Gerard W; Hjeij, Rim; Gee, Heon Yung; Otto, Edgar A; Halbritter, Jan; Chaki, Moumita; Diaz, Katrina A; Braun, Daniela A; Porath, Jonathan D; Schueler, Markus; Baktai, György; Griese, Matthias; Turner, Emily H; Lewis, Alexandra P; Bamshad, Michael J; Nickerson, Deborah A; Hildebrandt, Friedhelm; Shendure, Jay; Omran, Heymut; Zariwala, Maimoona A.

Am J Hum Genet ; 93(4): 711-20, 2013 Oct 03.

Article in English | MEDLINE | ID: mdl-24055112

ABSTRACT

Primary ciliary dyskinesia (PCD) is a genetically heterogeneous, autosomal-recessive disorder, characterized by oto-sino-pulmonary disease and situs abnormalities. PCD-causing mutations have been identified in 20 genes, but collectively they account for only ~65% of all PCDs. To identify mutations in additional genes that cause PCD, we performed exome sequencing on three unrelated probands with ciliary outer and inner dynein arm (ODA+IDA) defects. Mutations in SPAG1 were identified in one family with three affected siblings. Further screening of SPAG1 in 98 unrelated affected individuals (62 with ODA+IDA defects, 35 with ODA defects, 1 without available ciliary ultrastructure) revealed biallelic loss-of-function mutations in 11 additional individuals (including one sib-pair). All 14 affected individuals with SPAG1 mutations had a characteristic PCD phenotype, including 8 with situs abnormalities. Additionally, all individuals with mutations who had defined ciliary ultrastructure had ODA+IDA defects. SPAG1 was present in human airway epithelial cell lysates but was not present in isolated axonemes, and immunofluorescence staining showed an absence of ODA and IDA proteins in cilia from an affected individual, thus indicating that SPAG1 probably plays a role in the cytoplasmic assembly and/or trafficking of the axonemal dynein arms. Zebrafish morpholino studies of spag1 produced cilia-related phenotypes previously reported for PCD-causing mutations in genes encoding cytoplasmic proteins. Together, these results demonstrate that mutations in SPAG1 cause PCD with ciliary ODA+IDA defects and that exome sequencing is useful to identify genetic causes of heterogeneous recessive disorders.

Subjects

Surface Antigens/genetics , Cilia/genetics , Ciliary Motility Disorders/genetics , Dyneins/genetics , GTP-Binding Proteins/genetics , Kartagener Syndrome/genetics , Mutation/genetics , Adolescent , Adult , Animals , Axoneme/genetics , Child , Preschool Child , Cytoplasm/genetics , Epithelial Cells/metabolism , Exome , Female , Humans , Infant , Male , Pedigree , Phenotype , Young Adult , Zebrafish

19.

The haplotype-resolved genome and epigenome of the aneuploid HeLa cancer cell line.

Adey, Andrew; Burton, Joshua N; Kitzman, Jacob O; Hiatt, Joseph B; Lewis, Alexandra P; Martin, Beth K; Qiu, Ruolan; Lee, Choli; Shendure, Jay.

Nature ; 500(7461): 207-11, 2013 Aug 08.

Article in English | MEDLINE | ID: mdl-23925245

ABSTRACT

The HeLa cell line was established in 1951 from cervical cancer cells taken from a patient, Henrietta Lacks. This was the first successful attempt to immortalize human-derived cells in vitro. The robust growth and unrestricted distribution of HeLa cells resulted in its broad adoption--both intentionally and through widespread cross-contamination--and for the past 60 years it has served a role analogous to that of a model organism. The cumulative impact of the HeLa cell line on research is demonstrated by its occurrence in more than 74,000 PubMed abstracts (approximately 0.3%). The genomic architecture of HeLa remains largely unexplored beyond its karyotype, partly because like many cancers, its extensive aneuploidy renders such analyses challenging. We carried out haplotype-resolved whole-genome sequencing of the HeLa CCL-2 strain, examined point- and indel-mutation variations, mapped copy-number variations and loss of heterozygosity regions, and phased variants across full chromosome arms. We also investigated variation and copy-number profiles for HeLa S3 and eight additional strains. We find that HeLa is relatively stable in terms of point variation, with few new mutations accumulating after early passaging. Haplotype resolution facilitated reconstruction of an amplified, highly rearranged region of chromosome 8q24.21 at which integration of the human papilloma virus type 18 (HPV-18) genome occurred and that is likely to be the event that initiated tumorigenesis. We combined these maps with RNA-seq and ENCODE Project data sets to phase the HeLa epigenome. This revealed strong, haplotype-specific activation of the proto-oncogene MYC by the integrated HPV-18 genome approximately 500 kilobases upstream, and enabled global analyses of the relationship between gene dosage and expression. These data provide an extensively phased, high-quality reference genome for past and future experiments relying on HeLa, and demonstrate the value of haplotype resolution for characterizing cancer genomes and epigenomes.

Subjects

Epigenomics , Human Genome/genetics , Aneuploidy , DNA Copy Number Variations , Female , myc Genes/genetics , Haplotypes , HeLa Cells , Human papillomavirus 18/genetics , Human papillomavirus 18/physiology , Humans , Molecular Sequence Data , Mutation , Proto-Oncogene Mas , DNA Sequence Analysis , Transcriptional Activation/genetics , Uterine Cervical Neoplasms/genetics , Uterine Cervical Neoplasms/pathology , Uterine Cervical Neoplasms/virology

20.

Exome sequencing identifies mutations in CCDC114 as a cause of primary ciliary dyskinesia.

Knowles, Michael R; Leigh, Margaret W; Ostrowski, Lawrence E; Huang, Lu; Carson, Johnny L; Hazucha, Milan J; Yin, Weining; Berg, Jonathan S; Davis, Stephanie D; Dell, Sharon D; Ferkol, Thomas W; Rosenfeld, Margaret; Sagel, Scott D; Milla, Carlos E; Olivier, Kenneth N; Turner, Emily H; Lewis, Alexandra P; Bamshad, Michael J; Nickerson, Deborah A; Shendure, Jay; Zariwala, Maimoona A.

Am J Hum Genet ; 92(1): 99-106, 2013 Jan 10.

Article in English | MEDLINE | ID: mdl-23261302

ABSTRACT

Primary ciliary dyskinesia (PCD) is a genetically heterogeneous, autosomal-recessive disorder, characterized by oto-sino-pulmonary disease and situs abnormalities. PCD-causing mutations have been identified in 14 genes, but they collectively account for only ~60% of all PCD. To identify mutations that cause PCD, we performed exome sequencing on six unrelated probands with ciliary outer dynein arm (ODA) defects. Mutations in CCDC114, an ortholog of the Chlamydomonas reinhardtii motility gene DCC2, were identified in a family with two affected siblings. Sanger sequencing of 67 additional individuals with PCD with ODA defects from 58 families revealed CCDC114 mutations in 4 individuals in 3 families. All 6 individuals with CCDC114 mutations had characteristic oto-sino-pulmonary disease, but none had situs abnormalities. In the remaining 5 individuals with PCD who underwent exome sequencing, we identified mutations in two genes (DNAI2, DNAH5) known to cause PCD, including an Ashkenazi Jewish founder mutation in DNAI2. These results revealed that mutations in CCDC114 are a cause of ciliary dysmotility and PCD and further demonstrate the utility of exome sequencing to identify genetic causes in heterogeneous recessive disorders.

Subjects

Kartagener Syndrome/genetics , Microtubule-Associated Proteins/genetics , Mutation , Adult , Preschool Child , Exome , Female , Recessive Genes , Humans , Male , Middle Aged , Pedigree , Protein Isoforms , DNA Sequence Analysis
